Glottal Waveforms for Speaker Inference & A Regression Score Post-Processing Method Applicable to General Classification Problems
Abstract
Contributions are made along two main lines. First, a method is proposed that uses a regression model to learn relationships within the scores of a machine learning classifier; the learned model can then be applied to future classifier output to improve recognition accuracy. The method is termed r-norm, and strong empirical results are obtained from its application to several text-independent automatic speaker recognition tasks. Second, the glottal waveform, which describes the flow of air through the glottis during voiced phonation, is modelled for the task of inferring speaker identity. A prosody-normalised glottal flow derivative feature termed a source-frame is proposed, with empirical evidence presented for its utility in differentiating speakers. Inferences are also drawn from the glottal flow signal regarding detection of the affective disorder depression. Comprehensive literature reviews of automatic speaker recognition, forensic voice comparison and glottal waveform estimation are also presented.
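The abstract does not say which regression model or training targets r-norm uses, so the following is only a minimal sketch of the general idea of regression-based score post-processing. It assumes ridge regression, a hypothetical target of +1 for the true class and -1 otherwise, and the illustrative function names `fit_score_regression` and `apply_score_regression`; none of these come from the source.

```python
import numpy as np

def fit_score_regression(dev_scores, dev_labels, n_classes, ridge=1e-3):
    """Fit a ridge regression from raw score vectors to idealised score vectors.

    dev_scores: (n_samples, n_classes) raw classifier scores on held-out data.
    dev_labels: (n_samples,) integer index of the true class of each sample.
    Returns a weight matrix of shape (n_classes + 1, n_classes).
    """
    n_samples = dev_scores.shape[0]
    # Append a constant column so the regression can learn a per-class offset.
    X = np.hstack([dev_scores, np.ones((n_samples, 1))])
    # Assumed targets (not from the source): +1 for the true class, -1 elsewhere.
    T = -np.ones((n_samples, n_classes))
    T[np.arange(n_samples), dev_labels] = 1.0
    # Closed-form ridge solution: (X^T X + ridge * I)^{-1} X^T T
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ T)

def apply_score_regression(weights, scores):
    """Map new raw score vectors through the learned regression."""
    X = np.hstack([scores, np.ones((scores.shape[0], 1))])
    return X @ weights
```

In this sketch the regression is fitted once on held-out scores with known labels and then applied to every future score vector before the final decision (for example, an argmax over the regressed scores).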
Similar resources
Methods for estimation of glottal pulses waveforms exciting voiced speech
Nowadays, the most popular speech processing techniques are recognition of all kinds (speech, speaker and speaker-state recognition) and text-to-speech synthesis. In both domains, glottal pulse waveforms can be used. In recognition techniques, they can be used to describe the vocal cords, and that description can then be used for the classification of speak...
Classification-Based Detection of Glottal Closure Instants from Speech Signals
In this paper a classification-based method for the automatic detection of glottal closure instants (GCIs) from the speech signal is proposed. Peaks in the speech waveforms are taken as candidates for GCI placements. A classification framework is used to train a classification model and to classify whether or not a peak corresponds to the GCI. We show that the detection accuracy in terms of F1 ...
A Review of Glottal Waveform Analysis
Glottal inverse filtering is of potential use in a wide range of speech processing applications. Since voice production is, to a first-order approximation, a source-filter process, obtaining the source and filter components provides a flexible representation of the speech signal for use in processing applications. In certain applications the desire for accurate inverse filterin...
R-norm: improving inter-speaker variability modelling at the score level via regression score normalisation
This paper presents a new method of score post-processing which utilises previously hidden relationships among client models and test probes that are found within the scores produced by an automatic speaker recognition system. We suggest the name r-Norm (for Regression Normalisation) for the method, which can be viewed as both a score normalisation process and as a novel and improved modelling ...
Speaker Identification Using Glottal-Source Waveforms and Support-Vector-Machine Modelling
Speaker identification experiments are performed with novel features representative of the glottal source waveform. These are derived from closed-phase analysis and inverse filtering. Source waveforms are segmented into two consecutive periods and normalised in prosody, forming so-called source-frame feature vectors. Support-vector-machines are used to construct speaker discriminative hyperplan...
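The segmentation step described above can be sketched as follows. The two-period segmentation comes from the abstract; everything else is an assumption made for illustration: that period boundaries are given as glottal closure instant sample indices, that prosody normalisation means resampling to a fixed length and scaling to unit peak amplitude, and the hypothetical names `source_frames` and `frame_len`.

```python
import numpy as np

def source_frames(flow_derivative, gci_indices, frame_len=256):
    """Cut a glottal flow derivative signal into two-period, normalised frames.

    flow_derivative: 1-D array holding the estimated glottal flow derivative.
    gci_indices: increasing sample indices marking period boundaries (assumed
        here to be glottal closure instants).
    Returns an array of shape (n_frames, frame_len).
    """
    frames = []
    # Each frame spans two consecutive periods, i.e. three consecutive boundaries.
    for start, end in zip(gci_indices[:-2], gci_indices[2:]):
        segment = np.asarray(flow_derivative[start:end], dtype=float)
        if segment.size < 2:
            continue
        # Assumed length normalisation: resample to a fixed number of points,
        # which removes the dependence on pitch period.
        old_t = np.linspace(0.0, 1.0, segment.size)
        new_t = np.linspace(0.0, 1.0, frame_len)
        resampled = np.interp(new_t, old_t, segment)
        # Assumed amplitude normalisation: scale to unit peak magnitude.
        peak = np.max(np.abs(resampled))
        if peak > 0:
            resampled = resampled / peak
        frames.append(resampled)
    return np.array(frames) if frames else np.empty((0, frame_len))
```

Each row of the returned array could then serve as one feature vector for a classifier such as a support vector machine.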